Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

Training an Image Classification Multi-Class model using AutoML

In this notebook, we go over how you can use AutoML to train an Image Classification Multi-Class model. We will use a small dataset to train the model, demonstrate how to tune its hyperparameters to optimize performance, and deploy the model for use in inference scenarios. For detailed information, please refer to the documentation of AutoML for Images.


Important: This feature is currently in public preview. This preview version is provided without a service-level agreement. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Environment Setup

Please follow the "Setup a new conda environment" instructions to get started.

Workspace setup

In order to train and deploy models in Azure ML, you will first need to set up a workspace.

An Azure ML Workspace is an Azure resource that organizes and coordinates the actions of many other Azure resources to assist in executing and sharing machine learning workflows. In particular, an Azure ML Workspace coordinates storage, databases, and compute resources providing added functionality for machine learning experimentation, deployment, inference, and the monitoring of deployed models.

Create an Azure ML Workspace within your Azure subscription or load an existing workspace.
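A minimal sketch of loading a workspace with the SDK, assuming you have downloaded the workspace's `config.json` locally (a new workspace can instead be provisioned with `Workspace.create`):

```python
from azureml.core import Workspace

# Load an existing workspace from a local config.json.
ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, sep="\n")
```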

Compute target setup

You will need to provide a Compute Target that will be used for your AutoML model training. AutoML models for image tasks require GPU SKUs such as the ones from the NC, NCv2, NCv3, ND, NDv2 and NCasT4 series. We recommend the NCsv3 series (with V100 GPUs) for faster training. Using a compute target with a multi-GPU VM SKU leverages the multiple GPUs to speed up training. Additionally, setting up a compute target with multiple nodes allows for faster training through parallelism when tuning hyperparameters for your model.
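Provisioning such a cluster might look like the sketch below; the cluster name is a placeholder, and `ws` is assumed to be the Workspace loaded in the previous step:

```python
from azureml.core import Workspace
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException

ws = Workspace.from_config()
cluster_name = "gpu-cluster-nc6"  # placeholder name

try:
    # Reuse the cluster if it already exists in the workspace.
    compute_target = ComputeTarget(workspace=ws, name=cluster_name)
    print("Found existing compute target.")
except ComputeTargetException:
    config = AmlCompute.provisioning_configuration(
        vm_size="Standard_NC6s_v3",  # NCsv3 series with a V100 GPU
        min_nodes=0,
        max_nodes=4,  # multiple nodes enable parallel hyperparameter tuning
        idle_seconds_before_scaledown=600,
    )
    compute_target = ComputeTarget.create(ws, cluster_name, config)
    compute_target.wait_for_completion(show_output=True)
```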

Experiment Setup

Create an Experiment in your workspace to track your model training runs.
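For example (the experiment name is a placeholder, and `ws` is the Workspace from the setup step):

```python
from azureml.core import Experiment

# Runs submitted through this experiment are grouped together in the studio UI.
experiment = Experiment(ws, name="automl-image-classification")
```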

Dataset with input Training Data

In order to generate models for computer vision, you will need to bring in labeled image data as input for model training in the form of an AzureML Tabular Dataset. You can either use a dataset that you have exported from a Data Labeling project, or create a new Tabular Dataset with your labeled training data.

In this notebook, we use a toy dataset called Fridge Objects, which consists of 134 images across 4 classes of beverage containers (can, carton, milk bottle, water bottle), photographed against different backgrounds.

All images in this notebook are hosted in this repository and are made available under the MIT license.

We first download and unzip the data locally.
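The download-and-extract step can be sketched with the standard library; the archive URL is intentionally left as a placeholder for the actual Fridge Objects location:

```python
import os
import urllib.request
import zipfile

def download_and_extract(url: str, dest_dir: str) -> str:
    """Download a zip archive (skipping if cached) and extract it under dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    zip_path = os.path.join(dest_dir, os.path.basename(url))
    if not os.path.exists(zip_path):
        urllib.request.urlretrieve(url, zip_path)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)
    return zip_path

# Placeholder URL; substitute the actual fridgeObjects archive location:
# download_and_extract("<fridge-objects-zip-url>", "./data")
```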

This is a sample image from this dataset:

Convert the downloaded data to JSONL

In this example, the Fridge Objects dataset is stored in a directory containing four folders, one per class.

This is the most common data format for multi-class image classification: each folder name corresponds to the label of the images it contains.

In order to use this data to create an AzureML Tabular Dataset, we first need to convert it to the required JSONL format. Please refer to the documentation on how to prepare datasets.

The following script creates two .jsonl files (one for training and one for validation) in the parent folder of the dataset, with 20% of the data going into the validation file.
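A simplified version of such a conversion script is sketched below. It uses a deterministic every-fifth-image split (20%) and writes local paths into `image_url`; the actual notebook writes datastore-prefixed URLs instead, so treat the schema details as something to verify against the dataset documentation:

```python
import json
import os

def folder_to_jsonl(data_dir: str):
    """Convert a folder-per-class image dataset to train/validation JSONL files.

    Each line follows the multi-class schema:
    {"image_url": "<path>", "label": "<class>"}.
    The two files are written to the parent folder of data_dir.
    """
    parent = os.path.dirname(os.path.abspath(data_dir))
    train_path = os.path.join(parent, "train_annotations.jsonl")
    val_path = os.path.join(parent, "validation_annotations.jsonl")
    with open(train_path, "w") as train_f, open(val_path, "w") as val_f:
        for label in sorted(os.listdir(data_dir)):
            class_dir = os.path.join(data_dir, label)
            if not os.path.isdir(class_dir):
                continue
            # Send every fifth image (20%) to the validation file.
            for i, image in enumerate(sorted(os.listdir(class_dir))):
                entry = {"image_url": os.path.join(data_dir, label, image),
                         "label": label}
                target = val_f if i % 5 == 0 else train_f
                target.write(json.dumps(entry) + "\n")
    return train_path, val_path
```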

Upload the JSONL file and images to Datastore

In order to use the data for training in Azure ML, we upload it to our Azure ML Workspace via a Datastore. The datastore provides a mechanism for you to upload/download data and interact with it from your remote compute targets. It is an abstraction over Azure Storage.

Finally, we need to create an AzureML Tabular Dataset from the data we uploaded to the Datastore. We create one dataset for training and one for validation.

The validation dataset is optional. If no validation dataset is specified, 20% of your training data is used for validation by default. You can control the percentage using the split_ratio argument; please refer to the documentation for more details.
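Putting the upload and dataset registration together, a sketch assuming `ws` from the workspace setup and placeholder paths and dataset names:

```python
from azureml.core import Dataset, Workspace
from azureml.data.dataset_factory import DataType

ws = Workspace.from_config()

# Upload the images and JSONL annotation files to the default datastore.
ds = ws.get_default_datastore()
ds.upload(src_dir="./fridgeObjects", target_path="fridgeObjects")

# Create a Tabular Dataset from the uploaded JSONL annotations, treating
# the image_url column as a stream so images can be read during training.
training_dataset = Dataset.Tabular.from_json_lines_files(
    path=ds.path("fridgeObjects/train_annotations.jsonl"),
    set_column_types={"image_url": DataType.to_stream(ds.workspace)},
)
training_dataset = training_dataset.register(
    workspace=ws, name="fridgeObjectsTrainingDataset"
)
# Repeat with validation_annotations.jsonl for the validation dataset.
```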

This is what the training dataset looks like:

Configuring your AutoML run for image tasks

AutoML allows you to easily train models for Image Classification, Object Detection & Instance Segmentation on your image data. You can control the model algorithm to be used, specify hyperparameter values for your model as well as perform a sweep across the hyperparameter space to generate an optimal model. Parameters for configuring your AutoML Image run are specified using the AutoMLImageConfig - please refer to the documentation for the details on the parameters that can be used and their values.

When using AutoML for image tasks, you need to specify the model algorithms using the model_name parameter. You can either specify a single model or choose to sweep over multiple models. Please refer to the documentation for the list of supported model algorithms.

Using default hyperparameter values for the specified algorithm

Before doing a large sweep to search for the optimal models and hyperparameters, we recommend trying the default values for a given model to get a first baseline. Next, you can explore multiple hyperparameters for the same model before sweeping over multiple models and their parameters. This allows an iterative approach, as with multiple models and multiple hyperparameters for each (as we showcase in the next section), the search space grows exponentially, and you need more iterations to find optimal configurations.

If you wish to use the default hyperparameter values for a given algorithm (say vitb16r224), you can specify the config for your AutoML Image runs as follows:
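For example, a config sketch with `compute_target`, `training_dataset`, and `validation_dataset` taken from the earlier steps:

```python
from azureml.train.automl import AutoMLImageConfig
from azureml.train.hyperdrive import GridParameterSampling, choice
from azureml.automl.core.shared.constants import ImageTask

# A single model with default hyperparameters: grid-sample over one choice.
image_config_vit = AutoMLImageConfig(
    task=ImageTask.IMAGE_CLASSIFICATION,
    compute_target=compute_target,
    training_data=training_dataset,
    validation_data=validation_dataset,
    hyperparameter_sampling=GridParameterSampling({"model_name": choice("vitb16r224")}),
    iterations=1,
)
```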

Submitting an AutoML run for Computer Vision tasks

Once you've created the config settings for your run, you can submit an AutoML run using the config in order to train a vision model using your training dataset.
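For example, with the config and experiment from the previous steps:

```python
# Submit the AutoML image run and block until training finishes.
automl_image_run = experiment.submit(image_config_vit)
automl_image_run.wait_for_completion(wait_post_processing=True)
```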

Hyperparameter sweeping for your AutoML models for computer vision tasks

In this example, we use the AutoMLImageConfig to train an Image Classification model using the following model algorithms: seresnext, resnet50, vitb16r224, and vits16r224.

When using AutoML for Images, you can perform a hyperparameter sweep over a defined parameter space to find the optimal model. In this example, we sweep over the hyperparameters for each algorithm, choosing from a range of values for learning_rate, number_of_epochs, layers_to_freeze, etc., to generate a model with the optimal 'accuracy'. If hyperparameter values are not specified, then default values are used for the specified algorithm.

We use Random Sampling to pick samples from this parameter space and try a total of 10 iterations with these different samples, running 2 iterations at a time on our compute target, which has been previously set up using 4 nodes. Please note that the more parameters the space has, the more iterations you need to find optimal models.

We leverage the Bandit early termination policy, which terminates poorly performing configurations (those not within 20% slack of the best performing configuration), significantly saving compute resources.
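The sweep described above can be sketched as follows; the hyperparameter ranges are illustrative, and the dataset/compute objects come from the earlier steps:

```python
from azureml.train.automl import AutoMLImageConfig
from azureml.train.hyperdrive import BanditPolicy, RandomParameterSampling
from azureml.train.hyperdrive import choice, uniform
from azureml.automl.core.shared.constants import ImageTask

# Sweep over four algorithms and a few of their shared hyperparameters.
parameter_space = {
    "model_name": choice("seresnext", "resnet50", "vitb16r224", "vits16r224"),
    "learning_rate": uniform(0.001, 0.01),
    "number_of_epochs": choice(15, 30),
    "layers_to_freeze": choice(0, 2),
}

automl_image_config_sweep = AutoMLImageConfig(
    task=ImageTask.IMAGE_CLASSIFICATION,
    compute_target=compute_target,
    training_data=training_dataset,
    validation_data=validation_dataset,
    primary_metric="accuracy",
    hyperparameter_sampling=RandomParameterSampling(parameter_space),
    iterations=10,                 # 10 samples from the parameter space
    max_concurrent_iterations=2,   # 2 at a time on the 4-node cluster
    # Terminate configs not within 20% slack of the best one so far.
    early_termination_policy=BanditPolicy(
        evaluation_interval=2, slack_factor=0.2, delay_evaluation=6
    ),
)
```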

For more details on model and hyperparameter sweeping, please refer to the documentation.

When doing a hyperparameter sweep, it can be useful to visualize the different configurations that were tried using the HyperDrive UI. Navigate to the 'Child runs' tab in the UI of the main automl_image_run above to find the HyperDrive parent run, then open that run's own 'Child runs' tab. Alternatively, you can jump directly to the HyperDrive parent run below and navigate to its 'Child runs' tab:

Register the optimal vision model from the AutoML run

Once the run completes, we can register the model that was created from the best run (the configuration that resulted in the best primary metric).
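For example, with `automl_image_run` being the sweep run submitted above (the model file path inside the run's outputs is an assumption to verify against your run):

```python
# Pick the child run with the best primary metric and register its model.
best_child_run = automl_image_run.get_best_child()
model_name = best_child_run.properties.get("model_name", "automl-image-model")
model = best_child_run.register_model(
    model_name=model_name,
    model_path="outputs/model.pt",  # assumed path inside the run's outputs
)
```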

Deploy model as a web service

Once you have your trained model, you can deploy it on Azure as a web service on Azure Container Instances (ACI) or Azure Kubernetes Service (AKS). Please note that ACI only supports small models under 1 GB in size. For testing larger models or for high-scale production workloads, we recommend using AKS. In this tutorial, we deploy the model as a web service on AKS.

You will first need to create an AKS compute cluster or use an existing AKS cluster. You can use either GPU or CPU VM SKUs for your deployment cluster.

Next, you will need to define the inference configuration, which describes how to set up the web service containing your model. You can use the scoring script and the environment from the training run in your inference config.

Note: To change the model's settings, open the downloaded scoring script and modify the model_settings variable before deploying the model.

You can then deploy the model as an AKS web service.
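Putting the deployment steps together as a sketch: the cluster name, service name, and the scoring-script path inside the run's outputs are assumptions to verify against your run, and `ws`, `best_child_run`, and `model` come from the earlier steps:

```python
from azureml.core.compute import AksCompute, ComputeTarget
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AksWebservice

# Provision an AKS cluster for serving; a CPU SKU is shown here.
aks_target = ComputeTarget.create(
    ws, "cluster-aks-cpu",  # placeholder cluster name
    AksCompute.provisioning_configuration(vm_size="Standard_D3_v2"),
)
aks_target.wait_for_completion(show_output=True)

# Reuse the scoring script and environment from the best training run.
best_child_run.download_file(
    "outputs/scoring_file_v_1_0_0.py",  # assumed scoring-script path
    output_file_path="score.py",
)
inference_config = InferenceConfig(
    entry_script="score.py", environment=best_child_run.get_environment()
)

aks_config = AksWebservice.deploy_configuration(
    autoscale_enabled=True, cpu_cores=1, memory_gb=5, enable_app_insights=True
)
aks_service = Model.deploy(
    ws,
    name="automl-image-test",  # placeholder service name
    models=[model],
    inference_config=inference_config,
    deployment_config=aks_config,
    deployment_target=aks_target,
    overwrite=True,
)
aks_service.wait_for_deployment(show_output=True)
```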

Test the web service

Finally, let's test our deployed web service to predict new images. You can pass in any image. In this case, we'll use a random image from the dataset and pass it to the scoring URI.

Scoring

Inference on a single image in bytes format

Inference on a single image in base64 format

Inference on multiple images in base64 format
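Building the request bodies for these cases can be sketched as follows; note that the exact JSON schema (here an "inputs" list of base64 images) is an assumption to verify against your deployed scoring script:

```python
import base64
import json

def bytes_payload(image_path: str) -> bytes:
    """Raw request body: POST the image file contents directly."""
    with open(image_path, "rb") as f:
        return f.read()

def base64_payload(image_paths) -> str:
    """JSON request body with one base64-encoded entry per image."""
    images = [
        {"image": base64.encodebytes(open(p, "rb").read()).decode("utf-8")}
        for p in image_paths
    ]
    return json.dumps({"inputs": images})

# Hypothetical request, given the service's scoring URI and auth key:
# import requests
# resp = requests.post(scoring_uri, base64_payload(["sample.jpg"]),
#                      headers={"Content-Type": "application/json",
#                               "Authorization": f"Bearer {key}"})
```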

Visualize predictions

Now that we have scored a test image, we can visualize the prediction for this image.
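Parsing the top class out of a scoring response might look like the sketch below, assuming the response carries parallel "labels" and "probs" lists (verify the shape against your scoring script's output):

```python
import json

def top_prediction(response_json: str):
    """Pick the (label, probability) pair with the highest score."""
    result = json.loads(response_json)
    labels, probs = result["labels"], result["probs"]
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]
```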

Generate Explanations

Generate explanations for one input image in base64 format

Generate explanations for multiple images in base64 format

Visualize Explanations

Now that we have generated explanations on a test image, we can visualize the explanations for this image.
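One common step when overlaying explanations on the image is rescaling the per-pixel attribution scores to [0, 1] before blending them with the pixels; a minimal sketch:

```python
def normalize_attributions(attributions):
    """Min-max normalize a 2-D attribution map to [0, 1] for overlaying."""
    flat = [v for row in attributions for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid dividing by zero on a constant map
    return [[(v - lo) / span for v in row] for row in attributions]
```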